A Y-linked anti-Müllerian hormone type-II receptor is the sex-determining gene in ayu, Plecoglossus altivelis

Whole-genome duplication and genome compaction are thought to have played important roles in teleost fish evolution. Ayu (or sweetfish), Plecoglossus altivelis, belongs to the superorder Stomiati, order Osmeriformes. Stomiati is phylogenetically classified as sister taxa of Neoteleostei. Thus, ayu holds an important position in the fish tree of life. Although ayu is economically important for the food industry and recreational fishing in Japan, few genomic resources are available for this species. To address this problem, we produced a draft genome sequence of ayu by whole-genome shotgun sequencing and constructed linkage maps using a genotyping-by-sequencing approach. Syntenic analyses of ayu and other teleost fish provided information about chromosomal rearrangements during the divergence of Stomiati, Protacanthopterygii and Neoteleostei. The size of the ayu genome indicates that genome compaction occurred after the divergence of the family Osmeridae. Ayu has an XX/XY sex-determination system for which we identified sex-associated loci by a genome-wide association study by genotyping-by-sequencing and whole-genome resequencing using wild populations. Genome-wide association mapping using wild ayu populations revealed three sex-linked scaffolds (total, 2.03 Mb). Comparison of whole-genome resequencing mapping coverage between males and females identified male-specific regions in sex-linked scaffolds. A duplicate copy of the anti-Müllerian hormone type-II receptor gene (amhr2bY) was found within these male-specific regions, distinct from the autosomal copy of amhr2. Expression of the Y-linked amhr2 gene was male-specific in sox9b-positive somatic cells surrounding germ cells in undifferentiated gonads, whereas autosomal amhr2 transcripts were detected in somatic cells in sexually undifferentiated gonads of both genetic males and females. Loss-of-function mutation for amhr2bY induced male to female sex reversal. Taken together with the known role of Amh and Amhr2 in sex differentiation, these results indicate that the paralog of amhr2 on the ayu Y chromosome determines genetic sex, and the male-specific amh-amhr2 pathway is critical for testicular differentiation in ayu.

Introduction was phylogenetically classified into the superorder Protacanthopterygii, which contains salmon and pike [18]. More recently, however, Osmeriformes has been phylogenetically classified into the superorder Stomiati [19][20][21][22], which is sister taxa of Neoteleostei (S1 Fig). The divergence of Protacanthopterygii and the common ancestor of Stomiati and Neoteleostei is estimated to have occurred approximately 190 million years ago, and the divergence of Stomiati and Neoteleostei is estimated to have occurred approximately 180 million years ago [19,21,22]. The common ancestor of Osmeridae and Salmonidae diverged before the salmonid-specific whole-genome duplication (SaGD) event. Thus, ayu holds an important position in the fish tree of life. Comparative whole-genome analyses of ayu and other teleost fish can help to reveal aspects of chromosomal evolution in teleost fish.
Ayu are economically important for the food industry and recreational fishing in Japan. Aquaculture production of ayu was approximately 5,000 tons in 2017, the second largest inland aquaculture production in Japan [23]. In Japan, female ayu are more valuable than males as food, because mature fish with eggs are considered delicious. Thus, a genetic method to identify sex early in ayu development would have economic significance. In ayu, genetic sex is determined by an XX/XY system, and sex-linked AFLP markers have been developed [24][25][26]. However, little is known about the genomic region at the sex-determining locus in ayu.
In this study, we constructed a reference genome sequence for ayu using a whole-genome shotgun sequencing method from a one male individual and constructed genetic linkage maps using genotyping-by-sequencing (GBS)-derived single nucleotide polymorphisms (SNPs). The ayu genome sequence was analyzed to determine syntenic relationships with other fishes and to provide new insights into teleost genome evolution. We also identified sex-associated loci by genome-wide association studies using wild ayu populations. A strong candidate for the ayu sex-determining gene, a paralog of amhr2, was found to be expressed in a male-specific manner in supporting cells surrounding germ cells in morphologically undifferentiated gonads. These findings provide genetic evidence that the paralog of amhr2 on the Y chromosome is the sex-determining gene of ayu. We provide the first report for the sex-determining gene of species belong to the superorder Stomiati.

Construction of ayu reference genome
Whole-genome shotgun sequencing was carried out using one male ayu individual. Sequencing coverage was approximately 270× for short reads and 17× for long reads (S1 Table). The final genome assembly yielded 4,035 scaffolds longer than 1,000 bp. The longest scaffold was 16.8 Mb with an N50 scaffold length of 4.3 Mb. The total size of the assembly was 449.1 Mb. The genome size of ayu was estimated at approximately 420 Mb by a nuclear DNA staining method using red blood cells (S2A Fig). Thus, the total size of the assembly was relatively consistent with the estimated genome size of ayu. The genome assembly has been deposited in the DDBJ database under the accession number BNHK01000001-BNHK01004035.
To construct genetic linkage maps by pseudo-testcross mapping, GBS sequence data was obtained from an F 1 full-sib family that included 90 siblings in addition to the two parents. Sequencing coverage was approximately 3.2× for parents and 0.6× for siblings. On average, 98.9% of GBS reads were mapped onto the ayu draft genome sequence. Variant calling identified a total of 413,386 raw SNPs, which were filtered using criteria described in the Materials and methods. A total of 5,113 filtered SNPs were used to construct the genetic linkage maps in Lep-map3 [27].
A total of 3,670 segregating SNPs were successfully classified into 28 linkage groups (LGs), which is the haploid chromosome number of ayu (S2 Table) [28]. The female-specific map contained 2,262 SNPs with a total genetic distance of 2774.10 centi Morgans (cM) and an average marker interval of 1.23 cM (S2 Table and S3 Fig). The male-specific map contained 1,394 SNPs with a total genetic distance of 2782.43 cM and an average marker interval of 2.00 cM (S2 Table and Table. Scaffolds of the ayu genome assembly were anchored to the genetic linkage maps using ALLMAPS [29]. In total, 90.7% of the genome sequences (408.3 Mb) were anchored to linkage maps, and 72.4% of them were oriented on the chromosome. Additionally, 9.3% of the genome sequences (42.0 Mb) were unplaced. Assembly completeness was evaluated by BUSCO v3.0.1 [30]. The acti-nopterygii_odb9 (4,584 BUSCOs) was 94.0% complete (90.1% unique and 3.9% duplicated), 2.7% fragmented, and 3.3% missing. Augustus v3.3.3 predicted 32,283 coding sequences (S4 Table) [31]. Repetitive elements in the ayu genome accounted for 56.78 Mb (12.61% of the ayu genome assembly) using the fugu repeat database and 65.15 Mb (14.47%) using the zebrafish database as predicted by RepeatMasker v4.0. 5 (S2B and S2C Fig). RepeatMasker with the same parameters predicted fewer transposable elements in the ayu genome than in the genomes of rainbow trout, Atlantic salmon, northern pike and medaka.

Conserved synteny analysis
To identify traces of chromosomal rearrangement in teleosts, especially Stomiati and Protacanthopterygii, we compared the ayu genome with those of other fishes. Summary for phylogenetic classification of species using conserved synteny analysis is shown in S1 Fig. Medaka (Oryzias latipes, Ola) belongs to the superorder Neoteleostei and the order Beloniformes. Medaka preserved its ancestral karyotype after the Teleost-specific whole-genome duplication (TGD) event [32]. The divergence of the common ancestor of ayu and medaka was estimated to have occurred approximately 180 million years ago [19,21,22]. The haploid chromosome number of medaka is 24 [32]. Synteny analysis between ayu LGs and medaka chromosomes (chr) revealed large blocks of orthologous genes with an almost one-to-one correspondence ( Fig 1A and S6A and S7A Figs). However, medaka chr Ola1 corresponded to the ayu LGs Pal2 and Pal12; Ola3 corresponded to Pal10 and Pal13; Ola7 corresponded to Pal11 and Pal21; Ola9 corresponded to Pal8 and Pal25; Ola12 corresponded to Pal12 and Pal15; Ola14 corresponded to Pal17 and Pal19; and Ola20 corresponded to Pal24 and Pal26. Four chromosome fissions and fusion of a pair of chromosomes were detected as major chromosomal rearrangements.
Northern pike (Esox lucius, Elu) belongs to the superorder Protacanthopterygii and the order Esociformes and is phylogenetically classified as sister taxa of salmonids [19,21,22]. The divergence of the common ancestor of Protacanthopterygii and Stomiati was estimated to have occurred approximately 190 million years ago [19,21,22]. The lineages of Esociformes and Salmoniformes diverged at 75 million years ago before the SaGD event [11,19,21,22]. The haploid chromosome number of northern pike is 25 [11]. Synteny analysis between ayu and northern pike revealed large blocks of orthologous genes with an almost one-to-one correspondence (Fig 1B and S6B and S7B Figs). However, the ayu LG Pal12 corresponded to Elu14 and Elu25; Pal8 and Pal25 corresponded to Elu13; Pal12 and Pal15 corresponded to Elu14; and Pal17 and Pal19 corresponded to Elu7. As major chromosomal rearrangements, one chromosome fission and three fusions were detected.
Zebrafish (Danio rerio, Dre) belongs to the order Cypriniformes. The ayu and zebrafish lineages are estimated to have diverged approximately 250 million years ago [19,21,22]. The TGD event occurred before the divergence of the zebrafish and medaka lineages [34][35][36]. The haploid chromosome number of zebrafish is 25 [37]. The ayu LGs Pal12, Pal16, Pal17, Pal19 and Pal20 were aligned to multiple chromosomes of the zebrafish genome, suggesting that these chromosome differences were due to translocations mostly in the zebrafish lineage. The ayu LGs Pal3, Pal5, Pal8, Pal9, Pal15, Pal21, Pal25 and Pal28 showed one-to-two syntenic relationships with zebrafish chromosomes. Other ayu LGs showed roughly one-to-one syntenic relationships with zebrafish chromosomes. Half of zebrafish Dre4 did not show any syntenic relationship with the ayu genome, as expected from its high content of repetitive elements, its heterochromatic nature, and its lack of recombination ( Fig 1E and S7E Fig) [38,39]. Low similarity to Dre4 has also been reported in a comparison between Zebrafish and common carp [40].
Spotted gar (Lepisosteus oculatus, Loc) belongs to Holostei, the sister group of teleosts that diverged from teleosts before the TGD event [41,42]. The divergence of the common ancestor of ayu and spotted gar was estimated to have occurred approximately 300 million years ago [19,21,22]. The spotted gar genome is conserved from a vertebrate ancestor [43]. The haploid chromosome number of spotted gar is 29 [43]. Fifteen spotted gar LGs showed one-to-two syntenic relationships with ayu LGs. Orthologous blocks of spotted gar LGs Loc1-7 and Loc10 were detected in multiple ayu LGs. Orthologous chromosomes of spotted gar small LGs Loc28 and Loc29 were not detected in the ayu genome ( Fig 1F and S7F Fig).

Genome-wide association mapping to identify sex-determining locus
To explore the sex-determining locus of ayu, male and female ayu from wild populations were genotyped by GBS and whole-genome resequencing. Ayu were captured from wild populations in the Edogawa River and the Tama River. Both rivers run into Tokyo bay, and their estuaries are approximately 20 km apart as the crow flies (S8 Fig). Thus, we expected small genetic differences between the two populations because ayu is a diadromous fish. GBS data were obtained from 24 males and 24 females captured from the wild population in the Edogawa River. The mean sequencing coverage was approximately 0.7x. Whole-genome resequencing data were obtained from 10 males and nine females captured from the wild population in the Tama River. The mean sequencing coverage was approximately 12x (S5 Table). On average, 99.1% of resequencing data mapped onto the ayu reference genome sequence. We detected 2,238,473 raw SNPs by variant calling using data from both the Edogawa River population and the Tama River population. After filtering data from 34 males and 33 females genotyped by GBS and resequencing, 25,118 high-quality bi-allelic SNPs were retained (S9A Fig). The SNP density tended to be higher towards the center of chromosomes. The mean linkage disequilibrium between SNP pairs measured using r 2 across the whole genome was estimated at 0.0224 (S.D. = 0.0440) (S6 Table and S10 Fig). The LG Pal11 had the highest average linkage disequilibrium (r 2 = 0.0595). The r 2 value of Pal11 was significantly higher than those of other genomic regions (Mann-Whitney U test. P = 2.2 × 10 −16 ). Using these SNPs, the pairwise Fst between the Edogawa River population and the Tama River population was estimated to be 0.0198, suggesting that population differentiation between the two populations was small.
To reveal the population structure of ayu, we conducted principal component analyses (PCA). On the basis of k-means clustering of PCA scores, the Edogawa River population and the Tama River population were separated on PC1 (S11A and S11D Fig). Individuals within each population were further separated on PC2 (S11J Fig), suggesting that there is evidence of structure within the populations. The phenotypic sexes of the Edogawa River population were well separated on PC4, but those of the Tama River population were not as well separated (S11B, S11E, S11H, S11K and S11N Fig). PC3, PC5, and PC6 could not explain the genetic variances of the populations or phenotypic sexes (S11B and S11C Fig).
A genome-wide association scan with the "1-dom-ref" model of GWASpoly [44] detected 15 SNPs with significant sex differences at the genome-wide level. This analysis was conducted using data from 34 males and 33 females from the Edogawa River and Tama River populations. Ten SNPs were located on unanchored scaffold A (GenBank accession number BNHK01000104), four SNPs were located on unanchored scaffold B (BNHK01000131), and one SNP was located on unanchored scaffold C (BNHK01000134) (Fig 2A and S12 Fig). One SNP at position 849,395 on scaffold A showed the most significant association with sex (p = 5.18 × 10 −27 ). The lengths of the sex-linked scaffolds were 869.3 kb (scaffold A), 596.3 kb (scaffold B) and 568.8 kb (scaffold C).
To confirm that the loci detected by the genome-wide association study were sex-linked, we searched for evidence of genetic divergence between males and females using Fst as an index with 444,151 SNPs detected by whole-genome resequencing of 10 males and nine females in the Tama River population (S9B Fig). Genomic regions with high genetic differentiation (Z score of Fst > 10) between sexes were detected in scaffolds A and C (Fig 2B). The highest Fst was for 200,001-300,000 bp in scaffold C (Weir and Cockerham weighted Fst = 0.47, Z score of Fst = 19.49). The mean DNA nucleotide diversity index π in scaffold C was estimated to be 0.000213, which was higher than mean of whole genome 0.000343, suggesting that mutations accumulated in scaffold C (Welch's t test p = 0.003).
To validate genome-wide association scan for sex determining locus of ayu, we performed linkage analysis of genetic sex using ayu F 1 full-sib families captured from a wild population in the Nagaragawa River. Mapping families were generated from artificial fertilization of two pair of ayu and the siblings were mixed and reared until the phenotypic sex could be identified. GRAS-Di (Genotyping by Random Amplicon Sequencing-Direct) sequence data was obtained from two F1 full-sib families that included 88 siblings (44 females and 44 males) and two pair of parents. Sequencing coverage was approximately 5.2x for parents and 2.6× for siblings. In F 1 siblings, 30 females and 31 males were assigned to family 1 and 14 females and 13 males were assigned to family 2 by SNP-based parentage assignment. The male-specific map of family 1 identified 33 linkage groups and contained 1,978 SNPs with a total genetic distance of 1398.30 cM and an average marker interval of 0.71 cM (S7 Table and Table and S15  Fig). Simple interval mapping for genetic sex identified a significant peak on linkage group PalSex2 (LOD score 8.12) (S16 Fig). Position of the sex associated linkage group in the genome assembly was shown in S16B Fig. Compared to family 1, sex associated linkage group was divided into two linkage groups in family 2 (referred to PalSex and PalSex2). The 1.5 LOD support interval was from 8.0 to 14.0 cM. SNPs located in Scaffold A, Scaffold B and Scaffold C Male-associated SNPs were detected in three scaffolds. X-axis: linkage groups and scaffolds. Y-axis: −log 10 (p-value) based on association test. Red line indicates genome-wide significance threshold (Bonferroni corrected p-value = 0.05). (B) Genomic scan for evidence of genetic divergence between males and females using SNPs detected by whole-genome resequencing. Each dot represents Z scores of Fst values calculated for a 100-kb windows in 50-kb steps along the genome. Scaffolds in which male-associated SNPs were located by genome-wide association study also had the highest Fst values. that were detected as sex-links scaffold by GWAS using the Edogawa River and the Tama River population were also associated with genetic sex by linkage analysis using the ayu derived from the Nagaragawa River population.

Identification of ayu sex-determining gene candidates
To explore ayu sex-determining gene candidates, we analyzed the three sex-linked scaffolds in detail using linkage disequilibrium analysis and scaffold-wide association tests. In general, the sex-linked SNPs showed strong linkage disequilibrium [9]. The SNP intervals detected from the GBS data were too wide for detailed analyses, so we used SNPs with a minor allele frequency >20% detected by whole-genome resequencing of 10 males and nine females from the Tama River population (S9 Fig). We detected 1,033, 221 and 228 SNPs in the sex-linked scaffolds A, B and C, respectively. In scaffold A with 1,033 SNPs, p-values for association tests between SNP genotype and sex tended to be higher toward the 3 0 end of the scaffold (Fig 3A). Four SNPs were associated at a scaffold-wide level of significance (Bonferroni-corrected pvalue < 0.05; p < 4.8 × 10 −5 ). Similarly, in scaffold B with 221 SNPs, associations between SNP genotypes and sex were detected at the 3 0 end of the scaffold (Fig 3B). Two SNPs were associated at a scaffold-wide level of significance (p < 2.3 × 10 −4 ). Associations between phenotypic sex and blocks with strong linkage disequilibrium were detected throughout scaffold C using 228 SNPs (Fig 3C). Seventeen SNPs were significantly associated with sex across scaffold C (p < 2.2 × 10 −4 ). In scaffold C, 10.02% of sequences were predicted to be transposable elements by RepeatMasker by comparison with the fugu repeat database, whereas 1.70% and 2.25% of sequences were predicted as transposable elements in scaffold A and scaffold B, respectively (mean across the whole genome, 1.14%).
In ayu, sex-reversal experiments indicate that genetic sex is determined by an XX/XY system [24]. To identify Y chromosome-specific regions in sex-linked scaffolds, we compared mapping coverage of whole-genome resequencing between males and females in the Tama River population. In scaffold A, Y chromosome-specific regions of resequencing reads were mapped for males (XY) but not females (XX), and were detected at approximately 775-783, 805-812, 834-839, and 858-869 kb (S17A Fig, blue line). In scaffold B, Y chromosome-specific regions were detected at approximately 559-565 kb (S17B Fig, blue line). In scaffold C, Y chromosome-specific regions were detected at approximately 0-52, 146-158, 228-251, and 384-454 kb (Fig 4A, blue line). These putative Y chromosome-specific regions contained 23 putative coding sequences. Aside from transposon-related coding sequences, 10 coding sequences were identified as candidates for the ayu sex-determining gene: pgta, encoding geranylgeranyl transferase type-2 subunit alpha; eaa5a, encoding excitatory amino acid transporter 5a; amhr2, encoding anti-Muellerian hormone type-2 receptor; eaa5b, encoding excitatory amino acid transporter 5b; gtd2a, encoding general transcription factor II-I repeat domain-containing protein 2A; sgta, encoding small glutamine-rich tetratricopeptide repeat-containing protein alpha; thop1, encoding thimet oligopeptidase; syde2, encoding rho GTPase-activating protein SYDE2; wdr63, encoding WD repeat-containing protein 63; and gels encoding gelsolin (S9 Table). Neither the sex-determining gene of salmonids, sdY, nor its parent gene, irf9, were located in this region. This was expected, because these genes are also absent from the genome of northern pike [8,11].
We ruled out nine of the putative candidate sex determining genes using the following lines of evidence. Genomic PCR analyses using DNA extracted from four males and four females suggested that gtd2a and wdr63 were male specific; eaa5a, sgta and thop1 had some Y-specific insertions and deletions in intron regions; and pgta, eaa5b, syde2 and gels were not male specific (S18 Fig). In RT-PCR analyses of differentiated testes and ovaries, eaa5b mRNA was detected in testes but not ovaries, gtd2a and wdr63 mRNAs were not detected in gonads, and mRNAs of the other six candidate genes were detected in both testes and ovaries (S19 Fig). The gonads could not be dissected from ayu at the early sex differentiation stage, because the body size was too small. We focused on the predicted coding sequences showing similarity to amhr2 as a strong candidate for the ayu sex-determining gene, because amh signaling is critical for sex differentiation among vertebrates and amhr2 is known to be the sex-determining gene in fugu and yellow perch [2,[10][11][12][13][14]45].
Reference genome sequence was constructed from one XY male ayu individual. Thus, sexlinked scaffolds were probably a chimera of Y chromosome and X chromosome. To confirm sequence of sex-linked scaffold around strong candidate for the ayu sex-determining gene, we screened and sequenced a 66 kb BAC (Bacterial Artificial Chromosome) clone carrying sexlinked amhr2. Comparison of mapping coverage of whole-genome resequencing between males and females in the BAC clone sequence reveled that sex-linked amhr2 was certainly male-specific (S1 Text and S20A Fig). We then examined the BAC clone carrying sex-linked amhr2 by linkage disequilibrium analysis and scaffold-wide association tests by whole-genome resequencing of 10 males and nine females from the Tama River population. Associations between phenotypic sex and blocks with strong linkage disequilibrium were detected throughout sex-linked BAC clone using 240 SNPs with a minor allele frequency > 20% (S20B Fig). Eleven SNPs were significantly associated with sex in the BAC clone (p < 2.1 × 10 −4 ). In addition, the order of predicted genes around Y-linked amhr2 was the same between scaffold C and BAC clone (S20C Fig). Thus, we considered that whole genome sequences were reliable and carried out the subsequent analysis. Further studies will need to obtain accurate sequence of Y chromosome and X chromosome such as whole genome shotgun sequence of XX individual and YY individual.

Molecular cloning of candidate sex-determining gene
We obtained the cDNA fragment of the Y-linked amhr2 by RT-PCR using primers designed based on the predicted coding sequence in male-specific regions of scaffold C. Then, 5 0 and 3 0 rapid amplification of cDNA ends (RACE) yielded a 2017-bp full-length cDNA composed of 11 exons encoding 494 amino acids (GenBank accession number LC512011; Fig 4B). The amhr2Y gene was located at 398-403 kb in a male-specific region of scaffold C. Next, PCR analyses of the 12 males and 12 females used in the association study confirmed that amhr2Y is Y chromosome-specific (S21 Fig). In addition to the sex-linked amhr2Y, we blasted the cDNA sequence of amhr2Y against the ayu genome sequence and found an autosomal amhr2 located at 5.09 Mb in LG Pal26. The cDNA fragment of autosomal amhr2 was obtained by RT-PCR using primers designed based on the predicted coding sequence in LG Pal26. Subsequently, the full-length cDNA of autosomal amhr2 was cloned by 5 0 and 3 0 RACE. The full-length cDNA of autosomal amhr2 was 2,298 bp long, and encoded 509 amino acids (GenBank accession number LC512012). The autosomal amhr2 was named amhr2a and the Y-linked amhr2 was named amhr2bY. In RT-PCR analyses, amhr2bY mRNA was detected in testes but not ovaries of immature individuals (5 months post fertilization). Transcripts of amhr2a were detected in both testes and ovaries ( Fig 4C).
In a molecular phylogenetic analysis using the maximum-likelihood method, the Amhr2bY and Amhr2a in ayu clustered with Amhr2 of other teleosts and were most closely related to those in Northern Pike, Rainbow trout, and Atlantic salmon. These results indicated that the duplication giving rise to Amhr2bY was an ayu-specific event (S22 Fig). The deduced amino acid sequences of Amhr2bY and Amhr2a shared 84% similarity. According to homology with human AMHR2 (Protein Data Bank ID: Q16671) [46], activin type I and II receptor domains (S23 Fig, blue

Genomic comparison of amhr2bY and autosomal amhr2a
To obtain insight into the Y-specific duplication of amhr2, we compared genomic regions located in and around amhr2bY on scaffold C and autosomal amhr2a on LG Pal26. Pairwise genome alignments of the 30-kb genomic sequence around amhr2bY and amhr2a revealed that approximately 10 kb of the 5 0 region immediately upstream and 10 kb of the 3 0 region immediately downstream of the genes were dissimilar (S25 Fig). Prediction of transcription factor binding sites detected 12,619 binding sites for 566 transcription factors in the 5 0 intergenic region between amhr2bY and its adjacent gene (putative promoter region, 7,323 bp) (S10 Table). Comparisons of the regulatory regions of amhr2s among ayu, northern pike and medaka, revealed 26 transcription factors that regulated only amhr2bY (S26 Fig).
To obtain traces of Y-specific duplication of amhr2, we compared conserved synteny around amhr2bY and amhr2a in ayu with those of other fishes. Syntenic blocks including amhr2a on LG Pal26 were evolutionarily conserved with chromosome Elu17 in northern pike, Omy17 in rainbow trout, Ssa12 in Atlantic salmon, Ola5 excluding amhr2 in medaka and Loc4 in spotted gar ( Fig 5A). The syntenic block including pcbp2 to col2a1 adjacent to the conserved syntenic block containing amhr2a was not detected in Pal26 in ayu. No conserved syntenic relationships were detected between Pal26 and Ola7 (containing amhr2) in medaka. Conserved synteny analyses using whole genomes suggested that the orthologous chromosome of Elu17, Omy17, Ssa12, Ola5 and Loc4 was Pal20, not Pal26, in ayu (Fig 1 and S7 Fig). In medaka, amhr2 located on chromosome Ola7 was not located in the syntenic block conserved between fish on Ola5. In medaka, comparative gene mapping revealed that Ola5 and Ola7 are ohnologs of proto-chromosome 3 in the common ancestor of teleost fish before the TGD event [49]. The orthologous chromosome of Ola5 was Pal20 in ayu and that of Ola7 was Pal21 ( Fig 1A and S7A Fig). As expected, the conserved syntenic block including pcbp2 to col2a1 that was not detected in Pal26 was located on Pal20 and Pal21 in ayu ( Fig 5B). These regions also showed conserved syntenic relationships with Elu17 and Elu12 in northern pike and Loc4 in spotted gar.
Syntenic blocks surrounding amhr2bY on sex-linked scaffolds were evolutionarily conserved with Elu8 in northern pike, Omy1 and Omy5 in rainbow trout, Ssa10 and Ssa16 in Atlantic salmon and Ola4 in medaka ( Fig 5C). Conserved synteny analysis using whole genomes revealed that Elu8, Omy1, Omy5, Ssa10, Ssa16 and Ola4 were orthologous chromosomes of ayu Pal11 and Pal28 (Fig 1 and S7 Fig). The amhr2bY-containing region in the ayu genome did not contain three out of 10 genes conserved among northern pike, rainbow trout and Atlantic salmon. Instead of these conserved genes, amhr2bY, ticam1 (encoding TIR domain-containing adapter molecule 1-like), an extra copy of eaa5, gtd2a, and putative transportable element-related coding sequences were located in the amhrb2bY-containing region in the ayu genome.

Expression of amhr2bY during early gonadal differentiation
The Y-linked sex-determining gene functions as the first trigger of differentiation from sexually undifferentiated gonads to testes. If amhr2bY indeed determines the genetic sex of ayu, it should be expressed in sexually undifferentiated XY gonads during early sex differentiation. To investigate the expression of amhr2bY and amhr2a in sexually undifferentiated gonads, we first histologically observed the gonadal development of ayu under our rearing conditions after artificial fertilization. The gonad is located in the body cavity between the coelomic epithelium and the gut (S27 Fig  To determine the cell type of amhr2bY-expressing cells, we compared the expression patterns of amhr2bY and marker genes for gonadal differentiation. The germ-cell marker gene ddx4 (DEAD-box helicase 4, sometimes called vasa) is essential for development and differentiation of germ cells in many teleosts [51,52]. We detected amhr2bY mRNA in somatic cells surrounding ddx4-positive germ cells in morphologically undifferentiated gonads of genetic males, but not females, at 2 mpf (Fig 6C and 6F). sox9b (sometimes called sox9a2) is expressed in undifferentiated supporting cells surrounding germline stem cells in both males and females and is critical for maintaining sexually undifferentiated germline stem cells in medaka [53,54]. We detected sox9 binding sites in the putative regulatory regions of amhr2bY and amhr2a (S10 Table and S26 Fig). In ayu, sox9b mRNA was detected in somatic cells of morphologically undifferentiated gonads of both genetic males and females at 2 mpf (Fig 6H and 6K). Transcripts of amhr2bY and sox9b co-localized in somatic cells of morphologically undifferentiated male gonads at 2 mpf (Fig 6I and 6L). Both amhr2bY and autosomal amhr2a mRNAs co-localized in somatic cells of morphologically undifferentiated male gonads at 2 mpf (Fig 6O and  6R). Transcripts representing amh, the putative ligand of amhr2, were detected in both somatic amhr2. (B) Traces of teleost-specific whole-genome duplication and ayu-specific translocation of region containing amhr2a. The region containing pcbp2 to col2a1, the adjacent syntenic block to the region containing amhr2, was located in linkage group Pal20 and Pal21 in ayu. These regions showed conserved syntenic relationships to chromosome Elu12 and Elu17 in northern pike, Ola5 and Ola7 in medaka and Loc4 in spotted gar. (C) Conserved syntenic relationships of putative sex-determining locus of ayu. Syntenic blocks surrounding amhr2bY on sex-linked scaffolds (scaffold C) were evolutionarily conserved with Elu8 in northern pike, Omy1 and Omy5 in rainbow trout, Ssa10 and Ssa16 in Atlantic salmon and Ola4 in medaka. Region containing amhr2bY in ayu did not contain three of ten genes conserved among northern pike, rainbow trout and Atlantic salmon. TE indicates putative transposable element. Sp7: transcription factor Sp7, tarbp2: RISC-loading complex subunit tarbp2, tespa1: thymocyte expressed, positive selection associated 1, rpb1: putative DNA-directed RNA polymerase II subunit RPB1-like isoform, sp1: transcription factor Sp1, cdca7l: cell division cycle-associated protein 7-like, chd6: chromodomain-helicase-DNA-binding protein 6-like, rbl1: retinoblastoma-like protein 1 isoform X3, ptprt: receptor-type tyrosine-protein phosphatase T, ncoa2: nuclear receptor coactivator 2, prr13: proline-rich 13, pcbp2: poly(rC) binding protein 2, map3k12: mitogen-activated protein kinase kinase kinase 12, myg1: melanocyte proliferating gene 1, lrrc3: leucine-rich repeat-containing protein 3 cells and ddx4-positive germ cells of morphologically undifferentiated gonads in both sexes (Fig 7J-7O). Transcripts of both amhr2bY and amh were detected in the same somatic cells in male gonads at 2 mpf (Fig 7A-7C). Transcripts of amhr2a and amh were colocalized in somatic cells in both males and females at 2 mpf (Fig 7D-7I).

Loss-of-function analysis of amhr2bY
To obtain direct evidence that Y-linked amhr2bY is the sex determining gene of ayu, we examined the gonadal development of the loss-of-function mutants for amhr2bY using CRISPR/ Cas9 system in G0 generation. CRISPR-mediated amhr2bY mutations led to a frame shift (Fig 8A-8C). In general, primordial germ cells in female gonad proliferate by mitosis and then begin meiotic development during early gonadal differentiation [50]. On the other hand, primordial germ cells enter mitotic arrest in male gonad during this developmental period [15]. In ayu, number of germ cells in female gonads was also larger than males, so that the size of female gonads was bigger than males during early gonadal development (Fig 8D and 8E). In amhr2bY-knockout XY individuals, gonads were differentiated to female-like gonad at 3 mpf (Fig 8A and 8B). At 6 mpf, gonads of amhr2bY-knockout XY individuals, were differentiated to ovaries (Fig 8C). Diplotene stage oocytes and ovarian cavity were observed. A total of 160 injected individuals were analyzed. 81 individuals were genetically male (XY). Mutations by CRISPR/Cas9 were detected in 31 individuals of them. Seven individuals with mutation were induced to male-to-female sex reversal. Mutations by CRISPR/Cas9 were only induced in amhr2bY and off-target mutations in amhr2a were not detected.

Development of PCR-based genotyping method for genetic sex identification
In Japan, female ayu are more economically important than males as food. Therefore, there is a need for genetic sex identification of ayu in aquaculture. Previously developed AFLP markers were for locations outside the sex-determining scaffolds and were not completely linked to the sex phenotype. We developed a PCR-based genotyping method for genetic sex identification using amhr2bY and amhr2a in ayu. Two bands corresponding to amhr2bY and autosomal amhr2a were amplified from genetic males, and one band of autosomal amhr2a was amplified from genetic females (S30 Fig). To confirm the accuracy of genotyping method by amhr2bY, we genotyped ayu from various populations and analyzed the correlation between genetic sex and phenotypic sex. A total of 387 males and 324 females were genotyped, and genetic sex and phenotypic sex matched in more than 99% of ayu individuals (S11 Table). Three XY female individuals and one XX male individual were observed, suggesting that sex reversal by environmental factors occurred in the ayu populations. Our genotyping method can be applied in ayu aquaculture for all-female production. In ayu, XX males can be produced by artificial sex hormone treatment [55]. All-female population can produce by mating XX females and XX males. Thus, this method will contribute to the development of the aquaculture industry.

Ayu genome evolution
In this study, we obtained a draft genome sequence for ayu and analyzed syntenic relationships between ayu and other teleost fish. Ayu belongs to the superorder Stomiati and the order Osmeriformes, and Stomiati is phylogenetically classified as sister group of the superorder Neoteleostei [19][20][21][22]. The ayu genome size was estimated to be approximately 420 Mb by genome assembly and nuclear DNA staining. Genome size was estimated approximately 800 Mb in medaka and approximately 900 Mb in northern pike [32,56]. Our results indicated that genome compaction has occurred in the ayu lineage. In our annotation of the ayu genome, the density of predicted genes was 71.7 coding sequences/Mb, higher than that in medaka (32.2 genes/Mb) and northern pike (26.5 genes/Mb). This suggests that the ayu genome has compacted via reductions in the lengths of intergenic and intronic sequences, as occurred in the fugu genome [57]. Rainbow smelt (Osmerus mordax), a closely related species that also belongs to the order Osmeriformes and the family Osmeridae, has an estimated genome size of 690 Mb [58]. These results suggest that ayu genome compaction occurred after the divergence of the common ancestor of ayu and rainbow smelt. Of course, it is also possible that genome compaction occurred in a common ancestor and then the size of the rainbow smelt genome increased. Therefore, comparative genome analyses with other smelt fish in the order Osmeriformes will be useful to shed light on genome compaction in ayu.
The genetic linkage map of ayu contained 28 large LGs. The karyotype of the common teleost ancestor after TGD was 24, and medaka has retained this ancestral karyotype [32]. Comparative genomic analyses of medaka and ayu showed that 19 LG pairs retained one-to-one relationships, but four chromosome fissions and fusion of a pair of chromosomes have occurred as major chromosomal rearrangements. Thus, fission and fusion of ancestral chromosomes may have contributed to chromosomal rearrangements in ayu (Fig 1A and S6A and  S7A Figs). Northern pike is phylogenetically classified as sister group of salmonids. The common ancestor of northern pike and Salmonidae diverged before the SaGD event [11,19,21,22]. Conserved synteny analysis of ayu and northern pike showed that 22 LG pairs retain one-toone relationships with some minor translocations. In addition, one ayu to two northern pike relationships were detected as major chromosomal rearrangements (Fig 1B, S6B and S7B  Figs). Ayu LG Pal8 and Pal25 corresponded to northern pike Elu13, Pal12 and Pal15 corresponded to Elu14, and Pal17 and Pal19 corresponded to Elu7. These three one-to-two relationships were also detected in the comparison between ayu and medaka. These results suggest that these chromosome fissions are ayu-lineage specific and occurred after the divergence of ayu and northern pike. In comparative genomic analyses of ayu and salmonids, 23 LG pairs between ayu and rainbow trout showed the expected one-to-two relationships and 22 LG pairs between ayu and Atlantic salmon also showed one-to-two relationships (Fig 1C and 1D and  S7C and S7D Fig). However, some LG pairs showed one ayu-to-multiple salmonid chromosome relationships, indicative of fission and translocation of ancestral chromosomes. Thus, comparative genomic analyses among ayu, medaka and Protacanthopterygii indicate that the features of the ayu genome differ from those of Protacanthopterygii genomes. Ayu-specific chromosomal rearrangements are considered to reflect the phylogenic position of Stomiati, which is sister taxa of Neoteleostei. European smelt, Osmerus eperlanus belongs to the order Osmeriformes, and its karyotype is 2n = 56, the same as that of ayu [59]. Synteny analyses using other smelt fish are important to determine whether the chromosome rearrangements found in ayu are ayu-specific or common features in the superorder Stomiati. Previously, Osmeriformes was phylogenetically classified into superorder Protacanthopterygii. Ayu and salmonid have much in common in their external morphology such as adipose fin and life cycle such as migration between fresh water and sea and adaptation for cold water. However, the haploid chromosome number of ayu is 28, northern pike is 25 and Atlantic salmon and rainbow trout are 29. Few studies have investigated the genome evolution from a common ancestor to Stomiati and Protacanthopterygii to explain the difference of karyotype of these species. We provided first evidence that Stomiati-specific chromosomal rearrangements.

Association mapping to identify ayu sex-determining locus
We conducted a genome-wide association study using wild ayu populations to detect sex-associated SNPs with genome-wide significance. These SNPs belonged to three scaffolds that were not anchored to linkage maps. Of the three sex-linked scaffolds, scaffold C showed the most significant association with sex in a scaffold-wide association test. We found amhr2 located at positions 398-403 kb in male-specific regions of scaffold C. Another candidate, eaa5b (excitatory amino acid transporter 5b, as known as slc1a7b (solute carrier family 1 member 7b)), was also located in a Y-specific region around amhr2 on a male-linked scaffold and was expressed in testes but not in ovaries. eaa5 encodes a sodium-dependent glutamate transporter that is mainly expressed in axon terminals of rod and cone photoreceptors in the retina in mammals [60,61]. There are no previous reports that eaa5 plays a role in gonadal sex determination/differentiation. Among the other candidate coding sequences, none have been reported to be involved in gonadal sex determination/differentiation. Thus, we concluded that amhr2 on the Y chromosome is a strong candidate for the ayu sex-determining gene.

Duplication of amhr2 in ayu
We cloned two amhr2 genes from ayu. amhr2bY was located at the male-specific sex-determining locus and was expressed in the testis, while amhr2a was located on autosomal LG Pal26, and was expressed in both the testis and ovary. A pairwise alignment of amhr2Y and autosomal amhr2 with 5 0 and 3 0 putative regulatory regions and prediction of transcription factor binding sites in 5 0 upstream regions revealed similarities only in the exon regions. No similarities were detected in the upstream, downstream and intron regions.
Conserved synteny analysis revealed that amhr2 contained a genomic region derived from proto-chromosome 3 of the common ancestor of teleost fish before the TGD event. Protochromosome 3 duplicated and became the orthologous chromosome Ola5 in medaka, Pal20 in ayu, Elu17 in northern pike, Omy17 and Omy7 in rainbow trout, and Ssa12 and Ssa22 in Atlantic salmon and Ola7, Pal21, Elu12, Omy9, Omy16, Ssa13 and Ssa15 in the TGD event. It is considered that the common ancestor of medaka and ayu after the TGD event had two amhr2 genes. The medaka lineage retained the orthologous chromosome Ola7 containing amhr2 and lost Ola5. Ayu and the Protacanthopterygii lineage retained the orthologous chromosome Ola5 containing amhr2. Moreover, the amhr2-containing region was translocated from Pal20 to Pal26 in the ayu lineage. This translocation was not detected in northern pike and Salmonidae, suggesting that translocation of the amhr2a-containing region in ayu occurred after the divergence between the ayu and the Salmoniformes/Esociformes lineage.
Syntenic blocks surrounding amhr2bY on the sex-linked scaffold were evolutionarily conserved with Elu8, Omy1, Omy5, Ssa10, Ssa16 and Ola4. These were orthologs of ayu LGs Pal11 and Pal28. Pal11 had the highest linkage disequilibrium among the LGs in ayu, suggesting that sex-linked scaffolds may be mapped to Pal11. The highest linkage disequilibrium in Pal11 may reflect recombination suppression between the X and Y chromosomes. The region surrounding amhr2bY in ayu was evolutionarily conserved with regions in northern pike, rainbow trout, Atlantic salmon and medaka, while amhr2bY was detected only in ayu and not in other teleost fish. In addition, comparisons of the regulatory regions of amhr2bY and amhr2a could not detect similarity. More putative transposable elements were detected around amhr2bY than the average number detected across the whole genome (2-sample test for equality of proportions p < 0.001). These results raise the hypothesis that amhr2bY duplicated by itself. Of course, there are a possibility that amhr2bY duplicated with some surrounding genes and these surrounding genes have been lost. Amhr2bY obtained its function as a sex-determining gene as a result of autosomal amhr2a duplication and translocation. It is considered that transposable elements around the paralog of amhr2 on the Y chromosome triggered translocation. It is assumed that amhr2bY translocated with some of the surrounding region containing existing gonad-specific regulatory elements to a new position during the amhr2 duplication-transposition event, because amhr2bY was expressed in testicular supporting cells. One hypothesis is that the Y copy of amhr2 acquired novel regulation at its new location. The amhr2 duplication-transposition event likely occurred after divergence from the common ancestor of ayu, northern pike and Salmonidae. This type of evolution of a sex-determining gene by gene duplication-transposition has also been reported for other teleosts, such as medaka, Patagonian pejerrey, northern pike, yellow perch and threespine stickleback [4,5,[10][11][12]14]. Our results raise the question as to whether translocation of the amhr2a-containing region from Pal20 to Pal26 and the duplication-transposition event of amhr2a to the Y-specific region were independent or linked events. Conserved syntenic analyses and association mapping to detect sex-linked loci in other smelt fish in the order Osmeriformes may resolve this question.

Models for the mechanism of sex determination in ayu
Müllerian-inhibiting hormone, or AMH, was first identified in mammals [62,63]. AMH belongs to the transforming growth factor beta (TGF-β) superfamily and induces degeneration of Müllerian ducts, which are female reproductive ducts of tetrapods, in male embryos during early gonadal development. AMH transmits signals by binding to AMHR2, a type II transmembrane receptor for members of the TGF-ß family [64].
In teleosts, amh/TGF-β signaling is also critical for sex determination, even though fish do not have Müllerian ducts [64]. In Patagonian pejerrey, northern pike and threespine stickleback, the sex-determining gene is amh on the Y chromosome [10,11,12]. In Oryzias luzonensis, another TGF-β, gsdfY (gonadal soma-derived factor on the Y chromosome) is the sex-determining gene [6]. The gsdf gene is a critical factor for male sex differentiation in many teleosts [65] and is one of the direct downstream targets of the sex-determining gene dmy in medaka [66]. In yellow perch, the Y-linked duplicate of amhr2 is a strong candidate for the sex-determining gene. In addition, the Y-linked amhr2 has a specific deletion in its extracellular domain, suggesting that Y-linked amhr2 and autosomal amhr2 have different functions [14]. In fugu, amhr2 was located on both Y chromosome and X chromosome and expressed in somatic cells of undifferentiated gonads in both sexes. Genetic sex is determined by an amhr2 paralog containing a missense SNP in its kinase domain that reduces its signaling activity [13]. Thus, we concluded that amhr2bY was more likely to be the sex-determining gene than the nine other genes located in regions containing strongly sex-linked SNPs.
Expression analyses indicated that amhr2bY mRNA expression was male-specific in somatic cells sounding germ cells in morphologically undifferentiated gonads, indicating that amhr2bY was expressed in supporting cells. In other teleosts, sex-determining genes are expressed in supporting cells of undifferentiated gonads [4,6,10]. In mammals, the Y-linked sex-determining gene Sry induces testis development by directing sexually undifferentiated supporting cells to develop as Sertoli cells, which then direct all other gonadal cell types, including germ cells, to secondarily initiate a male differentiation pathway [67,68]. Thus, the expression pattern of amhr2bY in ayu corresponded well with those of testis-determining genes in other vertebrates. Furthermore, loss-of-function mutation of amhr2bY induced male to female sex reversal, indicating that amhr2bY is critical for testicular development in ayu.
Our results indicate that Y-linked amhr2bY is the sex-determining gene in ayu, and male-specific expression of amhr2bY determines the genetic sex of ayu.
In medaka, amhr2 signaling plays an important role in germ cell-supporting cell interactions during early gonadal differentiation [69]. In ayu, transcripts of amh, the putative ligand of amhr2bY and autosomal amhr2a, were detected in both supporting cells and germ cells in morphologically undifferentiated gonads of both males and females. Autosomal amhr2a transcripts were detected in somatic cells of sexually undifferentiated gonads in both genetic males and females, suggesting that amh-amhr2 signaling functions in germ cell-supporting cell interactions and supporting cell-supporting cell interactions during early gonadal development, regardless of sex (Fig 9A).
In ayu, amhr2bY was expressed in autosomal amhr2a-expressing cells. Several models can be considered to explain how amhr2bY determines the genetic sex, when amhr2bY and amhr2a are expressed in the same cells simultaneously during early gonadal development. In the evolutionarily conserved ATP-binding site in the catalytic domain of serine/threonine kinases in the BMPR2 and AMHR2 family, the 267 th amino acid in Amhr2bY is glutamine (a polar amino acid), while hydrophobic amino acids are located at the same position in Amhr2 proteins of other teleosts, including Amhr2a. Four contiguous amino acids including Q267 (265-268 th amino acids) are evolutionarily conserved as the ATP-binding site in BMPR2 and AMHR2 family kinases in vertebrates. In human, a missense mutation at H282Q (equivalent to amino acid 268 in ayu Amhr2bY) is associated with the persistent Müllerian duct syndrome, which is characterized by the lack of regression of Müllerian derivatives in externally phenotypic males [70]. Thus, the kinase activity of Amhr2bY may differ from those of Amhr2a and Amhr2s of other teleosts. If both receptor genes bind to the same ligand, then the downstream signaling pathway may differ between Amhr2bY and Amhr2a, because Amhr2bY has a mutation in the kinase domain. Our results suggest that amhr2bY determines the genetic sex and initiates testicular development via a male-specific amhr2 signaling pathway in sexually undifferentiated supporting cells (Fig 9B). However, the extracellular domain of Amhr2bY contains some specific amino acid changes, such as S44 (glycine to serine), I102 (polar to hydrophobic) and N131 (hydrophobic to polar). This raises the possibility that Amhr2bY and Amhr2a bind to different ligands. In this model, Amhr2bY binds to an amhr2bY-specific ligand. Then, Amhr2bY and the specific ligand complex transmit the male differentiation signal and initiate testicular development. Amhr2a binds to the amhr2a-specific ligand and functions in both male and female gonads (Fig 9C). Another hypothesis is that the gene dosage of Amhr2s is critical for sex determination in ayu, if Amhr2bY and Amhr2a bind to the same ligand and have exactly the same function. In that case, the 2X gene dosage of amhr2 would induce ovarian initiates testicular development via a male-specific signaling pathway in sexually undifferentiated supporting cells surrounding germ cells. (C) Hypothesis that different ligands bind to Amhr2bY and Amhr2a. Amhr2bY binds to amhr2bY-specific ligand (ligand Y). Amhr2bY is activated by a specific ligand and initiates differentiation, whereas the 3X gene dosage (2 autosomes + Y chromosome) would induce testicular differentiation (Fig 9D). Further studies are required to investigate the details of the signaling pathway(s) via Amhr2bY and autosomal Amhr2a during early gonad differentiation to reveal the molecular mechanisms of sex determination in ayu. While amh/TGF-β signaling is critical for sex determination in various teleost fish, little is known about the difference in function between Y-linked amhr2 and autosomal amhr2. Based on genetic evidence and expression patterns, we presented the first models of sex determination by duplicated amhr2.
In summary, we sequenced the genome of ayu (P. altivelis) as a model species in the superorder Stomiati. Conserved synteny analysis between ayu, Neoteleostei and Protacanthopterygii indicated that the ayu genome has undergone specific chromosomal rearrangements, which appear to have occurred after the divergence of ayu and the Protacanthopterygii lineage. Genome-wide association mapping using wild populations identified sex-linked scaffolds and a duplicate copy of amhr2 located in one sex-linked scaffold. Y-linked amhr2 expression was male-specific in supporting cells surrounding the germ cells of undifferentiated gonads. Lossof-function analysis for amhr2bY indicated that amhr2bY is critical for gonadal sex determination ayu. These results indicate that amhr2 on the Y chromosome is the sex-determining gene of ayu.

Ethics statement
All animal experiments were carried out with the approval of the Institutional Animal Care and Use Committee of the Tokyo University of Marine Science and Technology (No. 92, H30-13).

Ayu sources
The draft genome sequence of ayu was obtained from one male individual captured from a wild population in the Ota River, Hiroshima, Japan (34˚21 0 39@N 132˚24 0 20@E). A single F 1 full-sib family (90 siblings and both parents) reared at Tokyo University of Marine Science and Technology derived from the Ota River population was used to construct the ayu genetic linkage maps. In total, 24 males and 24 females captured from a wild population in the Edogawa River (Matsudo city, Chiba, Japan, 35˚46'36"N 139˚53'28"E) were used for GBS analysis to map sex-determining loci. Ten males and nine females captured from a wild population in the Tama River (Ota ward, Tokyo, Japan, 35˚33'48"N 139˚40'43"E) were used for resequencing to map sex-determining loci. To confirm sex-linked scaffolds detected by GWAS, two F 1 fullsib family (88 siblings, two male parents and two female parents) were used for linkage analysis using GRAS-Di analysis. Parents of these families were captured from a wild population in the Nagaragawa River (Kakamigahara city, Gifu, Japan, E136˚51 0 N35˚24 0 ) and siblings were mixed and reared until the phenotypic sex could be determined. 161 males and 143 females from cultured strain derived from Ota River, 98 males and 119 females captured from a wild population in the Nagaragawa River, 43 males and 23 females captured from a wild population in the Tama River, 74 males and 31 females captured from a wild population in the Jinzukawa River (Toyama city, Toyama, Japan, E137˚13 0 N36˚42 0 ) and 11 males and 8 females cultured fish originated from Biwa Lake (Shiga, Japan, E135˚56'N35˚07') were used to develop PCR-

Genetic linkage map construction
Genomic DNA was extracted from the caudal fins of the ayu F 1 full-sib family (90 siblings and parents) using a DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. DNA libraries were prepared using the restriction enzyme ApeKI according to the GBS protocol [81]. For GBS libraries, 96 samples (90 siblings, triplicate of male and female parents) were sequenced per lane on the Illumina HiSeq4000 platform with 100-bp paired-end reads.
Genetic linkage maps were constructed with the pseudo-testcross strategy using F 1 full-sib family in Lep-Map3 [27,85]. Three genetic linkage (female-specific map that constructed female parent informative markers and both parents informative markers, male-specific map that constructed male parent informative markers and both parents informative markers, and consensus sex-averaged linkage map) maps were constructed. The Filtering2 module in Lep-map3 was used to filter out markers with significant segregation distortion (dataTolerance = 0.001). The LGs were separated using the SeparateChromosomes2 module with a logarithm of odds (LOD) threshold of 10 (lodLimit parameter) and a recombination rate of 0.03 (theta parameter). The JoinSingles2All module was then used to assign additional single SNPs to existing LGs using an LOD threshold of 3. The LGs were ordered using the Order-Markers2 module with five iterations. Maternal and paternal linkage maps were constructed separately. The minimum number of markers per LG was set to 10. A sex-averaged linkage map was also obtained using Lep-map3 software. Linkage maps were confirmed by the position of SNP markers in the ayu genome sequence and conserved synteny relationship between other teleost genomes. Linkage map summaries were visualized using the R package Linkage-MapView [86]. Then, the ayu reference genome was ordered and oriented based on genetic linkage maps using ALLMAPS [29]. The female linkage map was assigned a weight of 2, and the male map a weight of 1. Chimeric scaffolds that mapped to different LGs were split using ALLMAPS. The split threshold was set to at least two markers.

Conserved synteny analysis
Conserved synteny analysis between ayu and the genomes of other teleosts, including Northern pike (E. lucius), rainbow trout (O. mykiss), Atlantic salmon (S. salar), medaka (O. latipes), zebrafish (D. rerio) and spotted gar (L. oculatus) were performed using Synima [93]. Orthologous genes were clustered by reciprocal best hits using predicted protein sequences. The minimum number of paired genes in a synteny block was set to 3. For conserved synteny analysis of the amhr2bY-containing region, putative orthologous gene pairs were complemented by blast searches manually. Syntenic relationships were visualized by Circos v0.69 and VGSC2 [94,95]. The northern pike, rainbow trout and Atlantic salmon genome sequences were obtained from the NCBI (accession numbers GCF_004634155, GCF_002163495 and GCF_000233375, respectively). The genome sequences of medaka, zebrafish and spotted gar were obtained from Ensembl Release 98 [96].

Genome-wide association study to identify sex-determining locus
Genomic DNA was extracted from the caudal fins of 24 males and 24 females captured from the wild population in the Edogawa River using a DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer's instructions, and then used for GBS analysis [81]. For GBS libraries, 48 samples were sequenced per 1/2 lane on the Illumina HiSeq4000 platform with 100-bp paired-end reads. Sequencing data have been deposited in the DRA under the accession number DRA010914.
For whole-genome resequencing, genomic DNA was extracted from the caudal fins of 10 males and nine females captured from the wild population in the Tama River using a Blood and Cell Culture DNA Mini Kit (Qiagen) according to the manufacturer's instructions and then sequenced on the Illumina HiSeq X platform with 150-bp paired-end reads at approximately 12× coverage. Library construction and sequencing were carried out by BGI (Shenzhen, China). Sequencing data have been deposited at the DRA under the accession number DRA010915. Low-quality reads were trimmed by Trimmomatic v3.6 [72]. Trimmed reads were mapped to the ayu reference genome sequence by bwa mem v0.7.12 [82]. For resequencing data, duplicate reads were marked using Picard tools v2.1.18 (http://broadinstitute. github.io/picard/). Mapping coverage was calculated by deepTools v3.1.3 [97] and visualized with Integrative Genomics Viewer v2.3.68 [98]. Variants between individuals from the Edogawa River and the Tama River were simultaneously called with GATK HaplotypeCaller v4.0.5 [83]. The SNPs located in repetitive sequences detected by RepeatMasker were eliminated from analyses.

Linkage disequilibrium analysis of sex-determining locus
For linkage disequilibrium analysis of sex-associated scaffolds, pairwise r 2 values between SNPs detected by whole-genome resequencing with minor allele frequencies > 20% were calculated and plotted with the R package LDheatmap [103]. Sex-associated scaffold association tests were performed using plink v1.9 with the allelic model using Fisher's exact test. The scaffold-wide significance threshold was set to a Bonferroni-corrected p-value < 0.05.

Molecular cloning and expression analysis of candidate sex-determining gene of ayu
Total RNA was extracted from testes and ovaries at approximately 12 mpf from ayu from the Tama River population using an RNeasy Mini kit (Qiagen). The extracted RNA was used for first-strand cDNA synthesis with an Omniscript RT kit (Qiagen) with oligo-dT primers. The cDNA fragments of ayu amhr2bY and amhr2a were amplified by RT-PCR using PrimeSTAR HS DNA Polymerase (Takara Bio Inc., Otsu, Japan), with amhr2bY-1F and amhr2bY-2R primers to amplify amhr2bY and amhr2a-1F and amhr2a-2R primers to amplify amhr2a. These primers were designed based on the predicted coding sequence of each gene (S12 Table). The PCR conditions were as follows: 1 min at 95˚C, followed by 35 cycles of 10 s at 98˚C, 5 s at 55˚C and 2 min at 72˚C, with final extension for 5 min at 72˚C. The amplified fragments were cloned into a pGEM-T easy vector (Promega, Fitchburg, WI, USA) for sequencing of both strands on an Applied Biosystems 3730xl DNA Analyzer (Thermo Fisher Scientific, Waltham, MA, USA).
To obtain the full-length cDNA of the ayu candidate sex-determining gene, we carried out 5 0 and 3 0 RACE. SMART cDNA was synthesized using a SMART RACE cDNA Amplification kit (Takara Bio Inc.) according to the manufacturer's instructions with the gene-specific primers amhr2bY-3R for 5 0 RACE of amhr2bY, amhr2bY-5F for 3 0 RACE of amhr2bY, amhr2a-3R for 5 0 RACE of amhr2a and amhr2a-4F for 3 0 RACE of amhr2a (S12 Table). The PCR amplification conditions were as follows: 1 min at 95˚C; followed by five cycles of 10 s at 98˚C and 2 min at 72˚C; five cycles of 10 s at 98˚C, 5 s at 70˚C, and 2 min at 72˚C; five cycles of 10 s at 98˚C, 5 s at 68˚C, and 2 min at 72˚C; 28 cycles of 10 s at 98˚C, 5 s at 65˚C, and 2 min at 72˚C; and final extension for 5 min at 72˚C. For amhr2bY, nested PCR was performed using 1 μL of the product of the first PCR round, the gene-specific primers amhr2bY-4R and amhr2bY-6F, and the following cycling protocol: 1 min at 95˚C, followed by 35 cycles of 10 s at 98˚C, 5 s at 65˚C, and 2 min at 72˚C, with final extension for 5 min at 72˚C (S12 Table). The 5 0 and 3 0 RACE products were cloned into a pGEM-T easy vector (Promega) for sequencing. The 15-amino acid insertion in amhr2a was confirmed by RT-PCR and sequencing with amhr2a-7F and amhr2a-8R using different individuals from those used to clone full-length cDNA (S12 Table). To confirm that amhr2bY is Y chromosome-specific, the amhr2bY genomic region was amplified with the primers amhr2bY-1F and amhr2bY-2R from 12 males and 12 females from the Tama River population. These individuals were different from the ten males and nine females from the Tama River used in the association study.
For expression analyses, testes and ovaries of immature individuals from cultured ayu derived from the Ota River (n = 4 of each sex) were dissected and fixed in RNAlater solution (Qiagen). Total RNA was extracted using an RNeasy Mini kit (Qiagen). First-strand cDNA was synthesized using an Ominiscript RT kit (Qiagen) and amplified by Takara Ex Taq (Takara Bio Inc.) using the primer sets amhr2bY-1F and 2R, and amhr2a-1F and 2R. The PCR conditions were as follows: 1 min at 95˚C, followed by 35 cycles of 30 s at 95˚C, 30 s at 55˚C, and 2 min at 72˚C. As an internal control, β-actin mRNA was amplified using the primer set bact-F and bact-R under the same PCR conditions mentioned above, but with 28 cycles (S12 Table). The conditions for genomic PCR and RT-PCR for other candidate coding sequences, including primer sequences and thermal cycling conditions, are summarized in S13 Table. The PCR products were sequenced using the same primers.
The protein structures of Amhr2bY and Amhr2a were predicted using the SWISS-MODEL homology-modelling server [48]. The crystal structure of the kinase domain of human BMPR2 (PDB ID; 3g2f.1.A [104]) was used as the template for homology modelling. Predicted models were visualized in UCSF Chimera v1.14 [105].
To compare genomic sequences of the putative regulatory regions for amhr2bY and amhr2a, we performed pairwise genome alignments of the 30-kb genomic sequence around amhr2bY and amhr2a (approximately 10 kb of the 5 0 upstream region, the regions containing amhr2bY and amhr2a, and approximately 10 kb of the 3 0 downstream region) using lastz v1.03.72 (https://github.com/lastz/lastz) [106]. Putative transcription factor binding sites in the intergenic regions between amhr2s and their adjacent genes were predicted by CiiiDER v0.9 using the JASPAR2020 CORE vertebrates non-redundant database [107,108].

Molecular phylogenetic analysis
The deduced amino acid sequences of Amhr2bY and Amhr2a in ayu were aligned with amino acid sequences of Amhr2 proteins from other teleosts using Muscle in MEGA X [109]. Multiple alignments were visualized by Jalview with the ClustalX color scheme [110]. A phylogenetic tree of teleost AMHR2s was constructed in MEGA X using the maximum likelihood method.
Bootstrap resampling was repeated 10,000 times. The names and GenBank accession numbers of analyzed sequences are shown in S14 Table.

Development of PCR-based genotyping method for genetic sex identification
Genomic DNA was extracted from caudal fins using a DNeasy Blood and Tissue kit (Qiagen) according to the manufacturer's instructions. Genomic DNA was amplified by Takara Ex Taq (Takara Bio Inc.). The primer set Ayu-sex-1F and Ayu-sex-2R was designed based on sequences of exon1 and exon2 (S12 Table). The primer set Ayu-sex-3F and Ayu-sex-4R was designed based on sequences of exon8 and exon9 (S12 Table). The intron length of amhr2a was longer than that of amhr2bY and these amplicons were able to be separated by agarose gel electrophoresis. The PCR conditions were as follows: 1 min at 95˚C, followed by 30 cycles of 30 s at 95˚C, 30 s at 55˚C, and 1 min at 72˚C, with a final extension of 5 min at 72˚C. DNA fragments were separated by agarose gel electrophoresis using 1% (w/v) Agarose S (Nippon Gene, Toyama, Japan). Genetic males had two amplified bands at 872 bp and 1162 bp that were derived from amhr2bY and autosomal amhr2a, whereas genetic females had one amplified band at 1162 bp that was derived from amhr2a by Ayu-sex-1F and Ayu-sex-2R (S30A Fig). Likewise, genetic males had two amplified bands at 390 bp and 960 bp, whereas genetic females had one band at 960 bp amplified using Ayu-sex-3F and Ayu-sex-4R (S30B Fig). To confirm accuracy of genotyping method by amhr2bY, ayu from various population (cultured strain derived from Ota River, wild ayu in the Tama River, Nagaragawa Rive and Jinzukawa River and cultured fish originated from Biwa Lake) were genotyped using Ayu-sex-3F and Ayu-sex-4R.

Loss-of-function analysis of amhr2bY
The sgRNA for amhr2bY was designed using CCTOP [111]. Forward 5'-TAGGAG-TAAGGCTCGGACCAGT-3' and reverse 5'-AAACACTGGTCCGAGCCTTACT-3' oligo nucleotide for target sequences of amhr2bY were annealed and cloned into the BsaI site of the pDR274 vector (Addgene, Watertown, MA. catalog number: #42250). The sgRNA was transcribed using DraI-digested gRNA expression vectors as templates using the ScriptMAX Thermo T7 Transcription kit (TOYOBO). Then, sgRNA was treated DNase I and purified by RNeasy Mini Kit (Qiagen). The 150 ng of Cas9 Nuclease protein NLS (Nippon Gene, Tokyo, Japan) and 25 ng of sgRNA were microinjected into fertilized ayu one-cell eggs obtained from the cultured strain derived from the Ota River population using a microinjector BJ-110 (BEX, Tokyo, Japan). The Body of amhr2bY-targeted G0 ayu were fixed in Bouin's solution and analyzed histology of developing gonads. The head and tail of amhr2bY-targeted G0 ayu were used to extract genomic DNA. Mutations of target site for CRISPR/Cas9 were identified by genomic PCR using the primer set amhr2bY-11F and amhr2bY-12R for amhr2bY and amhr2a-11F and amhr2a-12R for amhr2a (S12 Table). The PCR products were cloned into a pGEM-T easy vector (Promega) and analyzed by Sanger sequencing to identify individual editing patterns. At least 8 clones per individual were examined. Red boxes indicate catalytic domain of the serine/threonine kinases, bone morphogenetic protein, and anti-Müllerian hormone type II receptors (amino acids 208-492 in Amhr2bY and 209-507 in Amhr2a). Arrowheads indicate amino acid residues in evolutionarily conserved ATP binding site in catalytic domain of serine/threonine kinases in BMPR2 and AMHR2 family. Red arrowhead indicates ayu Amhr2bY-specific amino acid change in ATP binding site. Blue arrowhead indicates ayuspecific amino acid change in ATP binding site. Blue arrows indicate Amhr2bY-specific amino acid changes. Green arrows and green line indicate amino acid changes associated with human persistent Müllerian duct syndrome [70]. Orange arrow indicates amino acid residues critical for amhr2 function, as identified in medaka mutant [69]. It was conserved between Amhr2Yb and Amhr2a of ayu and that of other teleosts. Black arrow indicates amino acid residue determining genetic sex of fugu [13]. This amino acid did not alter between Amhr2bY and Amhr2a. Blue line indicates Amhr2a-specific 15-amino acid insertion.   Table. The candidates for sex determining gene located in putative male specific regions. (XLSX) S10 Table. Prediction of transcription factor binding sites in putative promoter regions of amhr2bY, amhr2a, northern pike amhr2 and medaka amhr2. (XLSX) S11 Table. PCR-based genotyping for genetic sex identification by amhr2bY and amhr2a using various population of ayu. (XLSX) S12 Table. PCR primers used for cloning and expression analyses. (XLSX) S13 Table. PCR primers and conditions used to detect candidates for ayu sex-determining gene.

Supporting information
(XLSX) S14 Table. GenBank accession numbers for sequences used in phylogenetic analysis. (XLSX) S15 Table. Marker sequences of linkage map for genetic sex of family 1 derived from Nagaragawa River population. (XLSX) S16 Dr. Kazuharu Nomura (Japan Fisheries Research and Education Agency) for estimating genome size by nuclear DNA staining. We are grateful to the Gunma Prefectural Fisheries Experimental Station, Gifu Prefectural Research institute for Fisheries and Aquatic Environments and the Bureau of Construction Tokyo Metropolitan Government for providing ayu from wild populations. We thank Mallory Eckstut, PhD, and Jennifer Smith, PhD, from Edanz Group (https://en-author-services.edanzgroup.com/ac) for editing drafts of this manuscript.